A novel neural-evolutionary framework for predicting weight on the bit in drilling operations

This study compares the performance of artificial neural networks (ANN) trained by grey wolf optimization (GWO), biogeography-based optimization (BBO), and Levenberg–Marquardt (LM) to estimate the weight on bit (WOB). To this end, a dataset consisting of drilling depth, drill string rotational speed, rate of penetration, and volumetric flow rate as input variables and the WOB as a response is used to develop and validate the intelligent tools. The relevance test is applied to sort the strength of WOB dependency on the considered features. It was observed that the WOB has the highest linear correlation with the drilling depth and drill string rotational speed. After dividing the databank into the training and testing (4:1) parts, the proposed LM-ANN, GWO-ANN, and BBO-ANN ensembles are constructed. A sensitivity analysis is then carried out to find the most powerful structure of the models. Each model performs to reveal the relationship between the WOB and the mentioned independent factors. The performance of the models is finally evaluated by mean square error (MSE) and mean absolute error criteria. The results showed that both GWO and BBO algorithms effectively help the ANN to achieve a more accurate prediction of the WOB. Accordingly, the training MSEs decreased by 14.62% and 24.90%, respectively, by applying the GWO and BBO evolutionary algorithms. Meanwhile, these values were obtained as around 9.86% and 9.41% for the prediction error of the ANN in the testing phase. It was also deduced that the BBO performs more efficiently than the other technique. The effect of input variables dimension on the accuracy and training time of the BBO-ANN clarified that the most accurate WOB predictions are achieved when the model constructs with all four input variables instead of utilizing either three or two of them with the highest linear correlation. It was also observed that the training stage of the BBO-ANN model with four input variables needs a little more computational time than its training with either two or three variables. Finally, the accuracy of the BBO-ANN model for the WOB prediction has been compared with the multiple linear regression, support vector regression, adaptive neuro-fuzzy inference systems, and group method of data handling. The statistical accuracy analysis confirmed that the BBO-ANN is more accurate than the other checked techniques.

The choice to use drilling processes in the upstream petroleum sector instead of more expensive activities is a crucial decision.This strategic shift holds significant importance and could lead to a substantial reduction in operational costs 1,2 .There are different parameters in the drilling rig operation (i.e., technical, economic, and human parameters) that have a high influence in this regard 3 .In the drilling industry, the relationship between drilled depth and time can be highly changed using different parameters.This relationship arises from the rate of penetration in the rock 2,4 .In the case of the drilling program, the meticulous selection of parameters that influence the rate of penetration into the formation is crucial and can potentially augment the penetration rate of the drill into the rock.The higher drilling rate can be due to the higher rate of penetration, which can decrease drilling time and finally decrease the total operation cost (e.g., Improving and predicting the impressive parameters can enhance the penetration rate for a specific zone until around 50 m that drilling rate is around 37 m per day.Since the wells of Iran are classified as deep wells, improvement and prediction of penetration rates in drilling wells are very significant 5 . The most appropriate penetration rate within the formation can be obtained by utilization of calculating approaches that can increase the drilling rate and may lead to lower costs by comparing and studying the impressive parameters by taking into account the mud utilized and the drill bit and hydraulics type as well as the rig power 6 .The least error can be achieved by considering impressive factors of the volume flow rate of drilling fluid, the formation rate of penetration, density of drilling fluid drill string rotational speed (RPM), the pressure of standpipe, size/type of drill bit along with kind of formation the relation among the weight on the drill bit and the drilling factors.
The above-mentioned parameters consist of the following factors: (1) It is known that the higher used weight does not presently mean a higher drilling rate.Weight on a bit is the weight exerted on the drill bit that is named weight on bit (WOB).The most important factor of the drilling rate is when the driller attempts to identically use the optimized weight of the drill string for the bit under the well bottom (e.g., in some sticky substrates, the higher weight leads to the bit falling within the layer.It can barricade the dig of the formation and the drill string rotation, and sometimes causes the drill string traps in the form or even cutting the drill string.(2) The rate of Penetration (ROP) into the formation shows the drilling rate.In this way, the drill adjustment attempts to retain it at the highest feasible level to the lowest cost along with the lowest risk of the operation.ROP in an operation team is commonly called and introduced as the per meter depth drilled (with the unit of time).(3) The drilling depth (Depth) along with the type and hardness of the formation have been classified as the factors that affect the size and type of bit (Rock Bit or PDC Bit).( 4) The volumetric flow rate of drilling fluid (GPM) is used with the driller based on drilling states.It includes the size of the drill bit as well as the diameter of the driller nozzles, depth, and pressure of the standpipe.
Researchers are trying to retain it at a constant amount.(5) The density of drilling fluid has been considered as one of the factors that is commonly used when hydrostatic pressure in the well depth is more than the formation pressure that prohibits the fluid formation flow of the well.It is commonly attempted to identify and use the optimized fluid density during drilling and determining the optimized fluid density while taking into account the formation layer and well situations.(6) The density of drilling fluid is calculated as the unit of pound per cubic foot.In addition, for drilling operations, the fluid density is named mud weight (i.e., MW). ( 7) Considering drilling conditions such as depth and formation type, one of the crucial parameters that is sought to be utilized optimally is the rotational speed of the drill string, measured in rounds per minute (rpm).( 8) The formation type that is being drilled varies in the case of geology.This leads to the enhancement of the drilling depth.The layers of formation regulation are usually similar in distinct points of a particular oilfield and even sometimes at the same depths, various wells commonly have the same and unchanged structures.(9) Another factor that affects the drilling rate and states is the fluid pressure pumped in the drill string.The drilling fluid can be pumped by taking into account the reservoirs in the drilling string.The drilling fluid, upon exiting the drill bit, should return to its initial level in the reservoirs, both in the annular space surrounding the drill string and within the well.The fluid pressure exerted by the drill depends on several factors, such as the density of the drilling fluid, the effectiveness of the pumps, the pressure drop across the drill nozzles, the pressure drop within the drill string, and the volumetric flow rate determined by the driller.
Many scholars are concerned about the optimization and simulation of the procedure utilizing the common mathematical methods due to the high variation of the parameters involved [7][8][9][10][11] .It is important to note that simulating systems by usual mathematical patterns like differential equations for complex systems that contain uncertainties is known as a relatively good method without high effective performance.Hence, many studies have used artificial intelligence and deep learning models to approximate key variable of different processes, including the drilling operations [12][13][14][15][16] .
Furthermore, some drawbacks of typical intelligent models have driven scholars to employ hybrid evolutionary algorithms to achieve a better approximation of the desired variable 17 .In this sense, Anemangely and Ramezanzadeh optimized the performance of an artificial neural network (ANN) by using a cuckoo optimization algorithm (COA) and particle swarm optimization algorithm for predicting the drilling rate 18 .They concluded that the combination of the neural network with COA could achieve high accuracy in the mentioned utilization.Moraveji and Naderi successfully employed a bat algorithm for identifying the optimal range of parameters for maximizing the drilling rate of penetration 8 .
This research applies machine learning techniques to predict the WOB as a function of different combinations of depth, RPM, ROP, and GPM as the input variables.Firstly, the relevance test sorted the impact of each independent variable on the WOB.Then, two wise optimization algorithms, namely grey wolf optimization (GWO) and biogeography-based optimization (BBO), are applied to improve the prediction capability of artificial neural networks in estimating the WOB.Although ANNs have been promisingly used, the authors came across a few studies that conduct the optimization of ANNs for prevailing its computational drawbacks in the mentioned field.Therefore, two novel ensembles of GWO-ANN and BBO-ANN are designed, and their best structure is determined by executing a sensitivity analysis.The effect of input variables dimension on the accuracy and training time of the best neural-evolutionary framework has also been investigated in this study.The prediction accuracy of the best neural-evolutionary framework is compared with the traditional ANN, multiple linear regression (MLR), support vector regression (SVR), adaptive neuro-fuzzy inference systems (ANFIS), group method of data handling (GMDH).

Data preparation and statistics
2250 data samples, comprising four independent variables (i.e., drilling depth, RPM, ROP, and GPM), as well as WOB as the dependent variable, are used to develop the intelligent models of this study.The key statistical parameters related to this databank are summarized in Table 1.
The histogram of all independent and dependent variables as well as their scatterplots (known as the pair plot) is presented in Fig. 1. 1800 samples (80% of the whole dataset) are used to perform the learning phase of the proposed models.Indeed, this learning phase helps the intelligent tools to understand the relationship between the WOB and its influential factors.Then, the capability of the developed networks is evaluated by the remaining 450 samples (20% of the whole dataset), which were not seen by the trained models in the learning phase.This stage, which is known as the testing phase, monitors the generalization ability of the trained models against some unseen data samples.www.nature.com/scientificreports/

Feature importance analysis
Feature importance analysis is a very important step before the development of predictive models to estimate the target WOB.This analysis helps understand the level of WOB sensitivity on independent factors.The results of applying Pearson method 19 on the available databank to reveal the importance level of features have been depicted in Fig. 2.This figure clarifies that the Depth and RPM are the most important features for determining the target, i.e., WOB.
The p-values between WOB and each independent variable are reported in Table 2.This analysis claimed that depth and RPM are highly significant statistically.In addition, ROP and GPM have been identified as very significant and not significant variables, respectively.

Methodology
After providing a proper dataset, it is divided into two parts for training and testing the proposed ANN, GWO-ANN, and BBO-ANN networks.To develop the mentioned ensembles, the GWO and BBO optimization techniques are coupled with the ANN.Then, the best structures are used to estimate the WOB.The results are evaluated utilizing two statistical criteria, i.e., mean square error (MSE) and mean absolute error (MAE).Equations ( 1) and ( 2) formulate the MSE and MAE.Finally, the results of the ensembles are compared with the typical ANN to examine the effect of the applied metaheuristic algorithms.Moreover, the predictive mathematical equation of the most capable network is presented 21 .in which the superscripts of act and cal denote the actual and calculated values of WOB, respectively.The term N represents the number of instances. (1)

Artificial neural network
The ANN algorithm can be used to solve different non-linear engineering problems 22 .McCulloch and Pitts 23 suggested the idea of this algorithm, which mimics the biological neural procedures.As other machine learning methods, two different groups of data are needed to develop the ANN.The first one includes the most data named training dataset that is employed for pattern recognition via developing mathematical relations.The second group analyzes the performance of the developed network named testing database.Multi-layer perceptron (MLP) is known as one of the most appropriate methods between various types of ANNs that were employed in many studies 24 .In the present paper, we have used the algorithm of LM (Levenberg-Marquardt) 25 , GWO, and BBO to accomplish the training phase of MLP neural networks.A typical MLP structure related to our study is shown in Fig. 3.
After performing the training method in the process of BP, the efficiency error is calculated based on the difference between the exact and estimated response variable.For updating the computational weights (W) as well as biases (b), the efficiency error is propagated backward.We consider Out, T = {Depth i , RPM i , ROP i , GPM i ; i = 1, 2, …, N}, and F as output vectors, input matrix, and the activation function of the MLP model.After that, the jth computational unit performance (i.e., the neurons of the MLP) can be described as follows: It should be mentioned that the current study uses the tangent sigmoid (tansig) and linear activation functions in the hidden and output layers, respectively 26 .

Grey wolf optimization
The grey wolf optimization has been suggested as a progressive evolutionary algorithm 27 .GWO follows the behavior of grey wolf herds, which is a type of social hierarchy.For the first time, this algorithm is suggested by Mirjalili et al. 28 .A flowchart of this algorithm is shown in Fig. 4. A group of wolves in the wolf flock (named alpha, α) is the head of the herd involving several male and female wolves that are decision-makers flocks for hunting and searching.A group of wolves in the wolf flock (named beta, β) assist and obey the first group of decisionmakers.The Beta wolves establish the discipline of the herd, and this is their main duty.When alpha (α) wolves retire or die, beta (β) wolves can substitute instead of them.The next group of wolves that act as scouts, sentinels, hunters, etc., and have the weakest relationships and babysitters are named delta (δ) (omega (ω) wolves).Omega (ω) wolves are the weakest wolves in the flock, and it is likely to see internal fights without omega (ω) wolves.These wolves can participate in hunting, and this is their main social behavior with a social hierarchy 29,30 .
Three base stages of the algorithm of GWO are (a) approaching the target, following, detecting, and (b) circumambienting the target, as well as (c) attacking the prey 31 .Alpha (α) is used as the highest-fitted solution, and after that β, δ, and ω.The encompassing formula is obtained as below: where t stands for the time of iteration.A new position for a wolf is proposed by the vector of − → D .The − → F and − → S stand for the coefficient vectors, and − → P p and − → P show the prey and wolf positions, respectively. (3) The typical MLP structure.
We assume that δ, α, and β are more trustworthy knowledge around the prey place, the most appropriate solutions for considered positions might be updated.After that, other wolves can then register the positions.
Finally, in this method, when a target stands the attack is accomplished.Note that the F includes random variables among − 2α and 2α.This shows that the wolves attack the target ( |A| < 1) or look for a better one ( |A| > 1) 29 .www.nature.com/scientificreports/

Biogeography-based optimization
Simon 32 introduced biogeography-based optimization that comes from the biogeography knowledge and dispensation of various species.This model has been classified as a population-based search method, and Mirjalili et al. 33 applied this algorithm to MLP to optimize its efficiency.Figure 5 denotes the flowchart of the BBO.This model begins using the generation of a so-called random population called "habitat." These parameters show possible solutions that have been obtained from the habitat suitability index (HSI).Moreover, the suitability index variable (SIV) can be used to evaluate the habitability of the habitats and zones.More precisely, an SIV shows the population of the candidate solutions and is classified as a group of real numbers.The BBO extracts two distinct practices (i.e., migration and mutation).The basic purpose is to improve the quality of the possible solutions by enhancing them according to other existing solutions.In this stage, λ g , an immigration rate, is considered for deciding about the need for correction of each SIV 34 .In addition, emigration rates (μ g ) can be used to choose the solution that migrates.It should be noted that other metaheuristic methods are retained constant away from this act to pass probabilistic corruption 35 .Abrupt variations in the case of HSI values can be due to some normal hazards threatening geographical zones.The habitat likely strays due to equilibrium HIS.This method is called mutation, and the probability of each species count determines its rate.This probability is expressed as below: in which S stands for the species number for a region.
To decide whether the relation will change or not, any population relation becomes a possibility.This additionally shows whether the relation is a solution to the existing issue or not.The proximity of the solution to the final solution can be obtained by probability value.A higher probability value means that the solution is near to the overall solution 36 .( 13) www.nature.com/scientificreports/

Stopping criteria
The involved optimization algorithms follow an iterative procedure to minimize the deviation between actual and predicted values of the WOB in the learning phase.The weights and biases of the ANN model are continuously updated by the backpropagation method to reduce this deviation as much as possible.Reaching the maximum number of iterations, converging to a prespecified accuracy, and experiencing a minimum error gradient are the most well-known criteria for stopping the learning phase.

Results and discussion
As explained, this study applies two novel optimization algorithms to perform the learning stage of artificial neural networks for estimating the WOB.To this end, after providing the required data, utilizing the programming language of Matlab (Version, 2019b), the GWO and BBO metaheuristic algorithms are coupled with an MLP neural network to find the best computational parameters of this tool.More specifically, the mathematical equation of the MLP is introduced as the main problem with the weights and biases as the variables.The mentioned algorithms aim to find the most appropriate values of the weights and biases, which produce the most consistent responses.Then, the optimized ANN is reconstructed by applying the new parameters.

Finding the optimal structure of intelligent models
It is well understood that diverse structures of intelligent models yield distinct results.Therefore, in order to pinpoint the most dependable configurations for the proposed GWO-ANN and BBO-ANN ensembles, a comprehensive process of trial and error was conducted.As explained supra, both GWO and BBO are population-based techniques.Ten different structures, i.e., with ten population sizes varying from 50 to 500 with 50 intervals, were designed and tested within 1000 repetitions.Remarkably, the MSE error criterion was defined as the cost function (CF) to evaluate the accuracy of the models.Over time, the individuals of the optimization algorithms aim to find a more fitted solution, which results in a decrease in error.It is worth noting that to evaluate the repeatability of the used models, each network was performed five times.The convergence curves of the implemented GWO-ANN and BBO-ANN are presented in Fig. 6a,b, respectively.As the charts illustrate, both GWO and BBO reduced the majority of errors in the first 300 iterations.Comparing the accuracies, it was revealed that the lowest MSE was obtained for the GWO-ANN and BBO-ANN with population sizes of 150 and 450, respectively.However, other population sizes presented a close MSE.Taking into account the computational time, the elite GWO-ANN took about 37.13 min to find the most proper structure of the ANN.This time was obtained about 107.62 min for the best BBO-ANN network.

Model assessment
Then, the MSE and MAE of the prediction responses of the implemented LM-ANN, GWO-ANN, and BBO-ANN in the training and testing stages were calculated to compare their efficiency.The results are shown in Fig. 7a-f.In this figure, the graphical comparison between the actual and predicted WOBs is presented, as well as the calculated error (i.e., indicating the difference between the targets and outputs) and the histogram of the errors.
As is seen, all three models performed satisfactorily in both recognizing and predicting the pattern of the WOB influenced by depth, RPM, ROP, and GPM.Moreover, the calculated MSEs indicate that the GWO and BBO algorithms reduced the learning error of the LM-ANN by 14.62% and 24.90% (from 16.96 to 14.48 and 12.73), respectively.Also, the obtained MAEs attest to these results with 7.70% and 10% (from 3.07 to 2.85 and 2.70) decrease in the mean absolute value of the ANN training error, respectively, by applying the GWO and BBO algorithms.
As for the testing phase, it was revealed that the reinforced ANN enjoys more prediction capability (in comparison with the typical one), regarding the observed decrease in the calculated MSEs and MAEs.Accordingly, the testing MSE declined by 9.86% and 9.41% (from 15.52 to 13.99 and 14.06).Additionally, these values were obtained as 2.05% and 5.82% (from 2.92 to 2.86 and 2.75) for the testing MAEs.Furthermore, the cross-plots associated with the estimated WOBs by the LM-ANN, GWO-ANN, and BBO-ANN are presented in Fig. 8a-c, respectively.
From a comparison point of view, it is deduced that the BBO surpassed the GWO in optimizing the ANN.In other words, the solution provided by the BBO algorithm for solving the mathematical equation of the ANN developed (for estimating the WOB) was more efficient than what was found by the GWO technique.
As mentioned, the BBO-ANN is the most accurate model of this study, but for answering the question "Which one of the neural-based ensembles is more efficient?"the complexity and time-effectiveness should be considered as well.It was shown that the BBO needs more time (about 2.9 times) and a larger population (3 times) compared to GWO to achieve the best solution.Notably, the difference between the learning error (i.e., the MSE) was approximately 2 units.Here, when time or computational ease comes out as determinant factors, utilizing the GWO is a better choice for optimizing the ANN, especially when it has provided satisfying accuracy.However, when the accuracy of the prediction is a more crucial factor, employing the BBO seems more reasonable.

Investigating the effect of input variable dimension
Pearson's method has previously been applied to sort the strength of WOB dependency on the depth, RPM, ROP, and GPM.This section investigates the effect of different combinations of input variables on both the accuracy and training time of the BBO-ANN model.As Table 3 shows, the BBO-ANN accuracy toward WOB predictions decreases by decreasing the input variable dimension from four (Depth, RPM, ROP, GPM) to two (Depth, RPM).It can also be seen that decreasing the input variable dimension has a slight impact on the BBO-ANN training time.Indeed, the training stage of the BBO-ANN model with four, three, and two input variables takes 31.17,30.72, and 30.42 min, respectively.

Comparison with other techniques
This section compares the BBO-ANN accuracy toward the WOB prediction with MLR 37 , SVR 38 , ANFIS 39 , and GMDH 40 .Equation ( 14) introduces the linear correlation between WOB and all four input variables.Table 4 relies on the presented MAEs and MSEs by the BBO-ANN, MLR, SVR, ANFIS, and GMDH in the training and testing steps to compare the models' accuracy.This comparison analysis clarifies that the BBO-ANN accuracy in both the training and testing stages is better than all the other checked techniques.After the BBO-ANN, the adaptive neuro-fuzzy inference system with a cluster radius of 0.5 is the best intelligent tool to predict WOB from Depth, RPM, ROP, and GPM.This

Extracting the neural predictive formula
Lastly, due to the superiority of the BBO-based ensemble, the governing formula of this network, composed of the ANN optimized parameters, is extracted and presented in Eq. ( 15).This equation addresses the mathematical rule established in the last layer of the used MLP (i.e., the output layer).Also, the optimized weights and biases of the middle layer (i.e., hidden layer) are presented in Table 5.where Z 1 , Z 2 , …, and Z 9 are calculated as follows:

Model interpretability
The drilling industry will greatly benefit from applying this novel neural-evolutionary framework for determining WOB in drilling operations.In the upstream oilfield industry, where cost-effective drilling operations are vital, an accurate calculation of WOB is essential for improving drilling processes.The model could represent the complex interactions affecting WOB by including drilling depth, drill string rotational speed, rate of penetration, and volumetric flow rate.With the use of this knowledge, drilling program decisions may be made with great consideration, resulting in parameters that considerably impact the rate of penetration into the formation.Its adaptability results from the framework's success in predicting WOB under various conditions, including those where mud qualities, drill bit types, and hydraulics are taken into account.Using such sophisticated prediction models can result in better-informed and optimized drilling procedures, which will eventually help the drilling industry succeed as it seeks out cost-and efficiency-saving opportunities.
The application of our study holds tremendous potential for operational companies looking for cutting-edge solutions in the complex and expensive world of drilling operations.Automated drilling rigs, often known as "smart" drilling rigs, are one example.These rigs use cutting-edge technology to automate several drilling operations, such as wellbore navigation and directional drilling 41 .These systems frequently use real-time data analysis and machine learning to improve drilling circumstances and make decisions.Real-time data analytics tools are also being integrated more and more frequently.These platforms analyze data from various sensors and drilling equipment to deliver quick insights into drilling performance, empowering operators to make decisions based on data immediately 42 .

Conclusions
The main objective of this study was to investigate the capability of the evolutionary optimization algorithm to accomplish the training process of artificial neural networks to predict the WOB as a function of different combinations of depth, RPM, ROP, and GPM.To this end, grey wolf and biogeography-based optimization algorithms were coupled with the ANN to adjust weights and biases.Based on the executed sensitivity analysis, the BBO-ANN model needed a larger population size and more computational time than the GWO-ANN.The obtained error criteria of MSE and MAE revealed that both applied algorithms perform satisfactorily in enhancing the learning capability of the ANN.Moreover, the BBO surpassed GWO in prevailing the computational drawbacks of the ANN, and also, the BBO-ANN produced the most consistent results, followed by GWO-ANN and LM-ANN.The effect of input variable dimension on the accuracy and training time showed that the BBO-ANN provided better WOB prediction when it was constructed based on all four input variables instead of utilizing those with the highest linear correlation (i.e., depth and RPM or depth, RPM, and ROP).In addition, the constructed BBO-ANN predicted the actual WOB data with better accuracy than the MLR, ANFIS, SVR, and GMDH.The BBO-ANN formula was presented to be used for directly estimating the WOB, influenced by four effective factors, including drilling depth, drill string rotational speed, rate of penetration, and volumetric flow rate.

Figure 2 .
Figure 2. Results of the feature importance analysis.

Figure 4 .
Figure 4.The working procedure of the GWO algorithm.

Figure 5 .
Figure 5.The operation procedure of the BBO algorithm.
ANFIS model estimates the training dataset with MAE = 2.96 and MSE = 15.51.It also predicts the testing group with MAE = 2.95 and MSE = 15.32.The support vector regression with the linear kernel is also the worst model among the tested tools to estimate the WOB.It has MAE = 3.65 and MSE = 21.48 in the training step and MAE = 3.60 and MSE = 21.42 in the testing step.

Table 1 .
Statistical analyses of the WOB and effective parameters.

Table 3 .
Checking the effect of input variable dimension on the BBO-ANN model performance.* Intel(R) Pentium(R) CPU G4400 @ 3.30 GHz, RAM 4.00 GB.

Table 4 .
Comparing the BBO-ANN reliability for the WOB prediction with other techniques.

Table 5 .
Optimized weight and biases of the BBO-ANN model.